@micro-arm
Contributor

Arm: Add SVE implementations of simdFilterHor

  • Add SVE implementations of the simdInterpolateHor filters for the 4-tap, 6-tap, and 8-tap cases.

  • Performance uplift vs Neon:
    N8: 1.36x
    N6: 1.09x
    N4: 1.09x

  • These benchmarks were obtained on a Neoverse V2 using LLVM 21.
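
For readers unfamiliar with the kernel being vectorized, here is a portable scalar sketch of what a horizontal N-tap filter computes, with the tap count as a template parameter to mirror the N4/N6/N8 cases above. It is not the code from this PR: the name `filterHorRef`, the `src[x + k]` indexing, and the round-to-nearest shift are all illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar reference for an N-tap horizontal filter.
// The function name, indexing convention, and rounding are assumptions
// made for illustration; they are not taken from the PR.
template <int N>
void filterHorRef(const int16_t* src, int16_t* dst, int width,
                  const int16_t (&coeff)[N], int shift)
{
    static_assert(N == 4 || N == 6 || N == 8,
                  "only the 4-, 6- and 8-tap cases are sketched");
    assert(width > 0 && shift > 0);

    const int32_t offset = 1 << (shift - 1);  // round to nearest
    for (int x = 0; x < width; ++x) {
        int32_t sum = 0;
        // Each output is a dot product of N neighbouring input samples
        // with the N coefficients; src must hold width + N - 1 samples.
        for (int k = 0; k < N; ++k)
            sum += static_cast<int32_t>(coeff[k]) * src[x + k];
        dst[x] = static_cast<int16_t>((sum + offset) >> shift);
    }
}
```

A SIMD version would presumably compute several of these dot products per instruction; how the SVE paths in this PR do that is not shown here.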


Arm: Refactor simdFilter_neon for width = 1 path

  • The M1 paths of simdFilter_neon currently fall back to the scalar code for the general filter case. Refactoring them to specialize for `width == 1` improves average performance by 2.0x across all taps compared to the current scalar implementation.
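
The dispatch pattern described above can be sketched in portable scalar C++ as follows. This is a hypothetical illustration only: `filterGeneral`, `filterWidth1`, `filterHor`, the stride parameters, and the rounding are assumed names and conventions, and the PR's actual Neon code is not reproduced.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// General path (hypothetical): loops over rows and columns.
template <int N>
void filterGeneral(const int16_t* src, ptrdiff_t srcStride,
                   int16_t* dst, ptrdiff_t dstStride,
                   int width, int height,
                   const int16_t (&coeff)[N], int shift)
{
    const int32_t offset = 1 << (shift - 1);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int32_t sum = 0;
            for (int k = 0; k < N; ++k)
                sum += static_cast<int32_t>(coeff[k]) * src[x + k];
            dst[x] = static_cast<int16_t>((sum + offset) >> shift);
        }
        src += srcStride;
        dst += dstStride;
    }
}

// width == 1 path (hypothetical): one output per row, so the column
// loop disappears and each row reduces to a single N-element dot product.
template <int N>
void filterWidth1(const int16_t* src, ptrdiff_t srcStride,
                  int16_t* dst, ptrdiff_t dstStride, int height,
                  const int16_t (&coeff)[N], int shift)
{
    const int32_t offset = 1 << (shift - 1);
    for (int y = 0; y < height; ++y) {
        int32_t sum = 0;
        for (int k = 0; k < N; ++k)
            sum += static_cast<int32_t>(coeff[k]) * src[k];
        dst[0] = static_cast<int16_t>((sum + offset) >> shift);
        src += srcStride;
        dst += dstStride;
    }
}

// Entry point (hypothetical): pick the specialized path when width == 1.
template <int N>
void filterHor(const int16_t* src, ptrdiff_t srcStride,
               int16_t* dst, ptrdiff_t dstStride,
               int width, int height,
               const int16_t (&coeff)[N], int shift)
{
    assert(width > 0 && height > 0 && shift > 0);
    if (width == 1)
        filterWidth1<N>(src, srcStride, dst, dstStride, height, coeff, shift);
    else
        filterGeneral<N>(src, srcStride, dst, dstStride, width, height, coeff, shift);
}
```

Both paths produce identical results for `width == 1`. The point of the split is that the single-column case has a fixed, simple shape that is easier to vectorize than the general loop.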

@micro-arm micro-arm marked this pull request as draft October 15, 2025 12:40
@micro-arm micro-arm force-pushed the interpolate-filterhor-sve branch from 7b02b70 to a0ce280 Compare October 15, 2025 13:31
@micro-arm micro-arm marked this pull request as ready for review October 15, 2025 13:54
@micro-arm
Contributor Author

Just force-pushed a fix for a compiler error found in the SVE code. I guess none of the CI jobs can compile the SVE paths yet?

@micro-arm micro-arm marked this pull request as draft October 30, 2025 20:05
@micro-arm micro-arm force-pushed the interpolate-filterhor-sve branch from a0ce280 to c8583f8 Compare October 31, 2025 18:18
@micro-arm micro-arm force-pushed the interpolate-filterhor-sve branch from c8583f8 to b552c23 Compare October 31, 2025 18:28
@micro-arm micro-arm marked this pull request as ready for review October 31, 2025 18:29
@K-os K-os merged commit e20acf8 into fraunhoferhhi:master Nov 21, 2025
8 checks passed